In [1]:
pip install plotly_express
Requirement already satisfied: plotly_express in c:\users\drake\anaconda3\lib\site-packages (0.4.1)
Requirement already satisfied: pandas>=0.20.0 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (2.2.2)
Requirement already satisfied: numpy>=1.11 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (1.26.4)
Requirement already satisfied: statsmodels>=0.9.0 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (0.13.2)
Requirement already satisfied: scipy>=0.18 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (1.9.1)
Requirement already satisfied: patsy>=0.5 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (0.5.2)
Requirement already satisfied: plotly>=4.1.0 in c:\users\drake\anaconda3\lib\site-packages (from plotly_express) (5.9.0)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=0.20.0->plotly_express) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=0.20.0->plotly_express) (2022.1)
Requirement already satisfied: tzdata>=2022.7 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=0.20.0->plotly_express) (2024.1)
Requirement already satisfied: six in c:\users\drake\anaconda3\lib\site-packages (from patsy>=0.5->plotly_express) (1.16.0)
Requirement already satisfied: tenacity>=6.2.0 in c:\users\drake\anaconda3\lib\site-packages (from plotly>=4.1.0->plotly_express) (8.0.1)
Collecting numpy>=1.11
  Using cached numpy-1.24.4-cp39-cp39-win_amd64.whl (14.9 MB)
Requirement already satisfied: packaging>=21.3 in c:\users\drake\anaconda3\lib\site-packages (from statsmodels>=0.9.0->plotly_express) (21.3)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in c:\users\drake\anaconda3\lib\site-packages (from packaging>=21.3->statsmodels>=0.9.0->plotly_express) (3.0.9)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.26.4
    Uninstalling numpy-1.26.4:
      Successfully uninstalled numpy-1.26.4
Note: you may need to restart the kernel to use updated packages.
ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'C:\\Users\\drake\\anaconda3\\Lib\\site-packages\\~~mpy.libs\\libopenblas64__v0.3.23-293-gc2f4bdbb-gcc_10_3_0-2bde3a66a51006b2b53eb373ff767a3f.dll'
Consider using the `--user` option or check the permissions.

In [2]:
pip install pyLDAvis
Requirement already satisfied: pyLDAvis in c:\users\drake\anaconda3\lib\site-packages (3.4.1)
Requirement already satisfied: setuptools in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (63.4.1)
Requirement already satisfied: joblib>=1.2.0 in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (1.2.0)
Requirement already satisfied: gensim in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (4.1.2)
Requirement already satisfied: funcy in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (2.0)
Requirement already satisfied: numexpr in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (2.8.3)
Requirement already satisfied: jinja2 in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (2.11.3)
Requirement already satisfied: scikit-learn>=1.0.0 in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (1.2.0)
Requirement already satisfied: scipy in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (1.9.1)
Requirement already satisfied: numpy>=1.24.2 in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (1.24.4)
Requirement already satisfied: pandas>=2.0.0 in c:\users\drake\anaconda3\lib\site-packages (from pyLDAvis) (2.2.2)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=2.0.0->pyLDAvis) (2.8.2)
Requirement already satisfied: tzdata>=2022.7 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=2.0.0->pyLDAvis) (2024.1)
Requirement already satisfied: pytz>=2020.1 in c:\users\drake\anaconda3\lib\site-packages (from pandas>=2.0.0->pyLDAvis) (2022.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\drake\anaconda3\lib\site-packages (from scikit-learn>=1.0.0->pyLDAvis) (2.2.0)
Requirement already satisfied: smart-open>=1.8.1 in c:\users\drake\anaconda3\lib\site-packages (from gensim->pyLDAvis) (5.2.1)
Requirement already satisfied: MarkupSafe>=0.23 in c:\users\drake\anaconda3\lib\site-packages (from jinja2->pyLDAvis) (2.0.1)
Requirement already satisfied: packaging in c:\users\drake\anaconda3\lib\site-packages (from numexpr->pyLDAvis) (21.3)
Requirement already satisfied: six>=1.5 in c:\users\drake\anaconda3\lib\site-packages (from python-dateutil>=2.8.2->pandas>=2.0.0->pyLDAvis) (1.16.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in c:\users\drake\anaconda3\lib\site-packages (from packaging->numexpr->pyLDAvis) (3.0.9)
Note: you may need to restart the kernel to use updated packages.
In [3]:
pip install wordcloud
Requirement already satisfied: wordcloud in c:\users\drake\anaconda3\lib\site-packages (1.9.3)
Requirement already satisfied: pillow in c:\users\drake\anaconda3\lib\site-packages (from wordcloud) (9.2.0)
Requirement already satisfied: numpy>=1.6.1 in c:\users\drake\anaconda3\lib\site-packages (from wordcloud) (1.24.4)
Requirement already satisfied: matplotlib in c:\users\drake\anaconda3\lib\site-packages (from wordcloud) (3.5.2)
Requirement already satisfied: packaging>=20.0 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (21.3)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (1.4.2)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (4.25.0)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (2.8.2)
Requirement already satisfied: cycler>=0.10 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (0.11.0)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib->wordcloud) (3.0.9)
Requirement already satisfied: six>=1.5 in c:\users\drake\anaconda3\lib\site-packages (from python-dateutil>=2.7->matplotlib->wordcloud) (1.16.0)
Note: you may need to restart the kernel to use updated packages.
In [4]:
pip install --upgrade pandas numpy
Requirement already satisfied: pandas in c:\users\drake\anaconda3\lib\site-packages (2.2.2)
Requirement already satisfied: numpy in c:\users\drake\anaconda3\lib\site-packages (1.24.4)
Collecting numpy
  Using cached numpy-1.26.4-cp39-cp39-win_amd64.whl (15.8 MB)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2.8.2)
Requirement already satisfied: tzdata>=2022.7 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2024.1)
Requirement already satisfied: pytz>=2020.1 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2022.1)
Requirement already satisfied: six>=1.5 in c:\users\drake\anaconda3\lib\site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.24.4
    Uninstalling numpy-1.24.4:
      Successfully uninstalled numpy-1.24.4
Successfully installed numpy-1.26.4
Note: you may need to restart the kernel to use updated packages.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
daal4py 2021.6.0 requires daal==2021.4.0, which is not installed.
scipy 1.9.1 requires numpy<1.25.0,>=1.18.5, but you have numpy 1.26.4 which is incompatible.
numba 0.55.1 requires numpy<1.22,>=1.18, but you have numpy 1.26.4 which is incompatible.
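
The resolver messages above report that the upgraded NumPy falls outside the ranges SciPy 1.9.1 and numba 0.55.1 declare. A minimal sketch for inspecting what is installed in the active environment, using only the standard library; the helper name `installed_versions` is hypothetical and not part of the notebook:

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages named in the resolver output above
for name, version in installed_versions(["numpy", "scipy", "numba", "pandas"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing the printed NumPy version with the ranges quoted in the error text shows whether a downgrade or a SciPy/numba upgrade is needed.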
In [5]:
pip install --upgrade pandas
Requirement already satisfied: pandas in c:\users\drake\anaconda3\lib\site-packages (2.2.2)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2.8.2)
Requirement already satisfied: numpy>=1.22.4 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (1.26.4)
Requirement already satisfied: pytz>=2020.1 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2022.1)
Requirement already satisfied: tzdata>=2022.7 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2024.1)
Requirement already satisfied: six>=1.5 in c:\users\drake\anaconda3\lib\site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Note: you may need to restart the kernel to use updated packages.
In [6]:
pip install --upgrade pandas seaborn
Requirement already satisfied: pandas in c:\users\drake\anaconda3\lib\site-packages (2.2.2)
Requirement already satisfied: seaborn in c:\users\drake\anaconda3\lib\site-packages (0.13.2)
Requirement already satisfied: numpy>=1.22.4 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (1.26.4)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2.8.2)
Requirement already satisfied: tzdata>=2022.7 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2024.1)
Requirement already satisfied: pytz>=2020.1 in c:\users\drake\anaconda3\lib\site-packages (from pandas) (2022.1)
Requirement already satisfied: matplotlib!=3.6.1,>=3.4 in c:\users\drake\anaconda3\lib\site-packages (from seaborn) (3.5.2)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (4.25.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (9.2.0)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (3.0.9)
Requirement already satisfied: packaging>=20.0 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (21.3)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.4.2)
Requirement already satisfied: cycler>=0.10 in c:\users\drake\anaconda3\lib\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (0.11.0)
Requirement already satisfied: six>=1.5 in c:\users\drake\anaconda3\lib\site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Note: you may need to restart the kernel to use updated packages.

Exploring Public Sentiment on Twitter: An NLP Approach

Part 1: Setting Up

In [7]:
# Importing InteractiveShell from IPython.core.interactiveshell module
from IPython.core.interactiveshell import InteractiveShell

# Setting the ast_node_interactivity option of InteractiveShell to "all"
# This allows IPython to display results for all statements in a code cell
# rather than just the last one, which is the default behavior.
InteractiveShell.ast_node_interactivity = "all"

import warnings
import matplotlib

# Suppress the specific MatplotlibDeprecationWarning
warnings.filterwarnings("ignore", category=matplotlib.cbook.MatplotlibDeprecationWarning)

Task 1: Import the Libraries

In [8]:
import os
import re
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer, PorterStemmer
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
import plotly_express as px
from gensim.corpora import Dictionary
from gensim.models.ldamulticore import LdaMulticore
from gensim.models.coherencemodel import CoherenceModel
import pyLDAvis.gensim
from wordcloud import WordCloud
import nbconvert
import nltk
C:\Users\drake\anaconda3\lib\site-packages\pandas\core\computation\expressions.py:21: UserWarning: Pandas requires version '2.8.4' or newer of 'numexpr' (version '2.8.3' currently installed).
  from pandas.core.computation.check import NUMEXPR_INSTALLED
C:\Users\drake\anaconda3\lib\site-packages\pandas\core\arrays\masked.py:60: UserWarning: Pandas requires version '1.3.6' or newer of 'bottleneck' (version '1.3.5' currently installed).
  from pandas.core import (
C:\Users\drake\anaconda3\lib\site-packages\scipy\__init__.py:155: UserWarning: A NumPy version >=1.18.5 and <1.25.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
In [9]:
nltk.download('punkt')
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\drake\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Out[9]:
True
In [10]:
nltk.download('stopwords')
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\drake\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
Out[10]:
True
In [11]:
nltk.download('wordnet')
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\drake\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
Out[11]:
True
In [12]:
nltk.download('omw-1.4')
[nltk_data] Downloading package omw-1.4 to
[nltk_data]     C:\Users\drake\AppData\Roaming\nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!
Out[12]:
True
In [13]:
nltk.download('vader_lexicon')
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\drake\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
Out[13]:
True

Part 2: Data Collection and Preprocessing

Task 2: Load the Dataset and Have a First Look

Load the CSV into a variable named df_twitter
In [14]:
df_twitter = pd.read_csv("covid19_twitter_dataset.csv",index_col=0)
Display the DataFrame along with the row count
In [15]:
df_twitter.head()
df_twitter.shape
Out[15]:
 | user_name | user_location | user_description | user_created | user_followers | user_friends | user_favourites | user_verified | date | text | hashtags | source | is_retweet | language | lat | long | country
0 | Tom Basile 🇺🇸 | new york, ny | Husband, Father, Columnist & Commentator. Auth... | 2009-04-16 20:06:23 | 2253 | 1677 | 24 | True | 2020-07-25 12:27:17 | Hey @Yankees @YankeesPR and @MLB - wouldn't it... | NaN | Twitter for Android | False | en | 40.712728 | -74.006015 | United States
1 | Time4fisticuffs | pewee valley, ky | #Christian #Catholic #Conservative #Reagan #Re... | 2009-02-28 18:57:41 | 9275 | 9525 | 7254 | False | 2020-07-25 12:27:14 | @diane3443 @wdunlap @realDonaldTrump Trump nev... | ['COVID19'] | Twitter for Android | False | en | 38.310625 | -85.487459 | United States
2 | DIPR-J&K | jammu and kashmir | 🖊️Official Twitter handle of Department of Inf... | 2017-02-12 06:45:15 | 101009 | 168 | 101 | False | 2020-07-25 12:27:08 | 25 July : Media Bulletin on Novel #CoronaVirus... | ['CoronaVirusUpdates', 'COVID19'] | Twitter for Android | False | en | 33.664930 | 75.162958 | India
3 | 🎹 Franz Schubert | новоро́ссия | 🎼 #Новоро́ссия #Novorossiya #оставайсядома #S... | 2018-03-19 16:29:52 | 1180 | 1071 | 1287 | False | 2020-07-25 12:27:06 | #coronavirus #covid19 deaths continue to rise.... | ['coronavirus', 'covid19'] | Twitter Web App | False | en | 43.341088 | 132.625674 | Россия
5 | Creativegms | dhaka,bangladesh | I'm Motalib Mia, Logo -Logo Designer - Brandin... | 2020-01-12 09:03:01 | 241 | 1694 | 8443 | False | 2020-07-25 12:26:50 | Order here: https://t.co/4NUrGX6EmA\r\n\r\n#lo... | ['logo', 'graphicdesigner', 'logodesign', 'log... | Twitter Web App | False | en | 23.764402 | 90.389015 | বাংলাদেশ
Out[15]:
(111973, 17)
Show the tweet count for the top 10 countries
In [16]:
df_twitter['country'].value_counts()[:10]
Out[16]:
country
United States     41931
India             19473
United Kingdom    11544
Canada             6679
Australia          4370
Nigeria            2632
South Africa       2415
Éire / Ireland     1545
Kenya              1493
中国                 1141
Name: count, dtype: int64
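
Raw counts are easier to compare as shares of the whole dataset. A small sketch using only the figures printed above (111973 rows from `df_twitter.shape` and the first three rows of the country counts); on the real DataFrame the same shares should come from `df_twitter['country'].value_counts(normalize=True)`:

```python
# Figures copied from the notebook output above
total_rows = 111973
top_counts = {"United States": 41931, "India": 19473, "United Kingdom": 11544}

# Share of all tweets contributed by each country
shares = {country: count / total_rows for country, count in top_counts.items()}

for country, share in shares.items():
    print(f"{country}: {share:.1%}")

# Combined share of the three most active countries
top3_share = sum(top_counts.values()) / total_rows
print(f"Top 3 combined: {top3_share:.1%}")
```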
Plot the top 20 users who post the most
In [17]:
df_twitter['user_name'].value_counts()[:20].plot(kind='barh')
Out[17]:
<AxesSubplot:ylabel='user_name'>

Task 3: Basic Text Preprocessing

Remove unnecessary columns for the analysis
In [18]:
df_twitter.drop(['user_description','user_created','user_favourites','language'],axis=1,inplace=True)
Check for missing values and handle them
In [19]:
# Count missing values per column before imputation
df_twitter.isnull().sum()

# Assign the filled column back; calling fillna(inplace=True) on a column
# selection is chained assignment and triggers a pandas FutureWarning
df_twitter['hashtags'] = df_twitter['hashtags'].fillna("[]")

df_twitter.isnull().sum()
Out[19]:
user_name             0
user_location         0
user_followers        0
user_friends          0
user_verified         0
date                  0
text                  0
hashtags          32184
source                0
is_retweet            0
lat                   0
long                  0
country               0
dtype: int64
Out[19]:
user_name         0
user_location     0
user_followers    0
user_friends      0
user_verified     0
date              0
text              0
hashtags          0
source            0
is_retweet        0
lat               0
long              0
country           0
dtype: int64
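
The imputation above fills missing hashtags with the string "[]", and the `head()` output suggests the other values are stringified Python lists such as "['COVID19']". Assuming that holds for the whole column, a hedged sketch for turning those strings into real lists; `parse_hashtags` is a hypothetical helper, not part of the notebook:

```python
import ast


def parse_hashtags(raw):
    """Convert a stringified list such as "['COVID19']" into a Python list.

    Anything that cannot be parsed as a list literal yields an empty list.
    """
    try:
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return []
    return parsed if isinstance(parsed, list) else []


# Sample strings in the same shape as the hashtags column shown earlier
samples = ["['CoronaVirusUpdates', 'COVID19']", "['COVID19']", "[]"]
print([parse_hashtags(s) for s in samples])
```

On the DataFrame this could be applied as `df_twitter['hashtags'].apply(parse_hashtags)`, for example before counting the most frequent hashtags.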
Convert date column to datetime object and extract features
In [20]:
df_twitter['date'] = pd.to_datetime(df_twitter['date'])
df_twitter['year'] = df_twitter['date'].dt.year
df_twitter['month'] = df_twitter['date'].dt.month
df_twitter['day'] = df_twitter['date'].dt.day
df_twitter['hour'] = df_twitter['date'].dt.hour
df_twitter['day_of_week'] = df_twitter['date'].dt.dayofweek
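
A toy illustration of what the `.dt` accessors above return, using two timestamps taken from the `head()` output. Note that `dayofweek` counts from Monday=0, so a Saturday is 5. The `day_name` column is an extra added here for readability and is not part of the notebook:

```python
import pandas as pd

# Two timestamps copied from the head() output above
toy = pd.DataFrame({"date": ["2020-07-25 12:27:17", "2020-07-25 12:26:50"]})
toy["date"] = pd.to_datetime(toy["date"])

toy["hour"] = toy["date"].dt.hour
toy["day_of_week"] = toy["date"].dt.dayofweek  # Monday=0 ... Sunday=6
toy["day_name"] = toy["date"].dt.day_name()    # human-readable weekday

print(toy)
```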
Apply basic_clean_text() function to text column
In [21]:
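def basic_clean_text(text):
    # Lowercase so later matching is case-insensitive
    text = text.lower()
    # Remove URLs while they are still recognizable
    text = re.sub(r"http\S+|www\S+", '', text)
    # Remove HTML-like tags (str.replace would treat the pattern as a literal string)
    text = re.sub(r"<.*?>", '', text)
    # Remove punctuation and symbols, keeping word characters and whitespace
    text = re.sub(r"[^\w\s]", '', text)
    # Remove digits
    text = re.sub(r"\d+", '', text)
    # Collapse runs of spaces last, since the removals above leave gaps
    text = re.sub(r" +", ' ', text)
    return text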

df_twitter['text'] = df_twitter['text'].apply(basic_clean_text)
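
To see what this kind of regex cleaning does to a single tweet, here is a minimal standalone sketch. The function `clean_tweet` and the sample tweet are hypothetical and only mirror the general steps (lowercase, strip URLs, tags, punctuation, digits, extra whitespace); they are not the notebook's exact function:

```python
import re


def clean_tweet(text):
    text = text.lower()
    text = re.sub(r"http\S+|www\S+", "", text)  # drop URLs
    text = re.sub(r"<.*?>", "", text)           # drop HTML-like tags
    text = re.sub(r"[^\w\s]", "", text)         # drop punctuation and symbols
    text = re.sub(r"\d+", "", text)             # drop digits
    text = re.sub(r"\s+", " ", text).strip()    # collapse whitespace
    return text


# Made-up tweet for illustration only
sample = "Stay safe! Visit https://example.com <b>now</b> #COVID19 @WHO 2020"
print(clean_tweet(sample))  # prints "stay safe visit now covid who"
```

The order matters: URLs and tags have to be removed before punctuation, otherwise the `://` and angle brackets are stripped first and the remaining fragments no longer match.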

Task 4: Implement Advanced Text Preprocessing

Apply advanced_text_preprocessing() function to text column
In [22]:
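# Build the stop-word set and the lemmatizer once rather than on every call
STOP_WORDS = set(stopwords.words('english'))
LEMMATIZER = WordNetLemmatizer()

def advanced_text_preprocessing(text):
    # Split the cleaned tweet into word tokens
    tokens = word_tokenize(text)
    # Drop English stop words
    filtered_tokens = [word for word in tokens if word.lower() not in STOP_WORDS]
    # Reduce each remaining token to its WordNet lemma (treated as a noun by default)
    lemmatized_tokens = [LEMMATIZER.lemmatize(word) for word in filtered_tokens]
    return " ".join(lemmatized_tokens)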

df_twitter['text'] = df_twitter['text'].apply(advanced_text_preprocessing)

Part 3: Sentiment Analysis

Task 5: Perform Sentiment Analysis with the VADER Library

Perform sentiment analysis with the VADER library
In [23]:
sid = SentimentIntensityAnalyzer()

df_twitter["sentiment_scores"] = df_twitter['text'].apply(lambda x:sid.polarity_scores(x))
Display a random sample of 10 tweets with their sentiment scores
In [24]:
df_twitter[['text','sentiment_scores']].sample(10).values
Out[24]:
array([['aplusk realdonaldtrump patron saint covid',
        {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}],
       ['although school year begin little differently school supply still needed shop taxfree essent',
        {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}],
       ['yall thought sharknado joke wait election month come fr sharkweek shark',
        {'neg': 0.0, 'neu': 0.82, 'pos': 0.18, 'compound': 0.296}],
       ['centuryold advice also applies covid maskup first one maskupamerica',
        {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}],
       ['confidence like muscle use stronger get sundaythoughts covid',
        {'neg': 0.0, 'neu': 0.373, 'pos': 0.627, 'compound': 0.8126}],
       ['updated table new case recovered active case covid twithaca',
        {'neg': 0.0, 'neu': 0.748, 'pos': 0.252, 'compound': 0.4019}],
       ['jharsuguda district report covid positive case last hour',
        {'neg': 0.0, 'neu': 0.66, 'pos': 0.34, 'compound': 0.5574}],
       ['chicago death last month covid homicide gun chicagosmayors absur',
        {'neg': 0.474, 'neu': 0.526, 'pos': 0.0, 'compound': -0.743}],
       ['senschumer democrat caused democrat bed china amp ccp covid',
        {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}],
       ['promise promise told guy id beat abhishek bachchan testing covid negative',
        {'neg': 0.227, 'neu': 0.491, 'pos': 0.282, 'compound': -0.0258}]],
      dtype=object)
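
In the output above, `neg`, `neu`, and `pos` are proportions that sum to 1 (for example 0.227 + 0.491 + 0.282), while `compound` is a single score in [-1, 1]. To the best of available knowledge, VADER's reference implementation derives `compound` by squashing the summed word valences with the function sketched below (with `alpha=15`); treat this as an illustration of the normalization step rather than a reimplementation of the analyzer:

```python
import math


def normalize_compound(raw_sum, alpha=15):
    """Squash an unbounded summed valence into the open interval (-1, 1)."""
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)


# Hypothetical summed valences, chosen only to show the shape of the curve
for raw in (-8.0, -2.0, 0.0, 2.0, 8.0):
    print(f"raw sum {raw:+.1f} -> compound {normalize_compound(raw):+.4f}")
```

The function is odd and monotonic, so the sign of `compound` always matches the sign of the summed valences, and large sums saturate near ±1.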

Task 6: Classify the Tweets into Positive, Neutral and Negative

Classify the tweets into categories of positive, negative, or neutral sentiment
In [25]:
threshold_value = 0.0
# Label each tweet by comparing its VADER compound score with the threshold
df_twitter['sentiment'] = df_twitter['sentiment_scores'].apply(
    lambda x: 'positive' if x['compound'] > threshold_value
    else ('neutral' if x['compound'] == threshold_value else 'negative')
)

print(df_twitter['sentiment'].value_counts())
sentiment
positive    45225
negative    33807
neutral     32941
Name: count, dtype: int64
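
With a threshold of exactly 0.0, only tweets with a compound score of precisely zero count as neutral, so a score as weak as -0.0258 (seen in the sample above) is labeled negative. A commonly cited alternative for VADER is a neutral band of ±0.05. The sketch below shows that variant; `label_sentiment` is a hypothetical helper, and the scores are copied from the sample output:

```python
def label_sentiment(compound, threshold=0.05):
    """Label a compound score, treating (-threshold, threshold) as neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores taken from the random sample shown earlier
sample_compounds = [0.0, 0.296, 0.8126, -0.743, -0.0258]
labels = [label_sentiment(c) for c in sample_compounds]

for compound, label in zip(sample_compounds, labels):
    print(f"{compound:+.4f} -> {label}")
```

Switching to a band would move weakly scored tweets from the positive and negative groups into neutral, so the class counts printed above would change.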

Part 4: Trend Analysis and Visualization

Task 7: Display the Evolution of Sentiment Over Time

Extract the top 3 countries to be used as a filter
In [26]:
most_active_countries = df_twitter['country'].value_counts().nlargest(3).index.tolist()
most_active_countries
Out[26]:
['United States', 'India', 'United Kingdom']
Filter the DataFrame
In [27]:
filtered_data = df_twitter[df_twitter['country'].isin(most_active_countries)]
filtered_data.shape
Out[27]:
(72948, 20)
Create sentiment over time by country based on groupby
In [28]:
sentiment_over_time_by_country = filtered_data.groupby([pd.Grouper(key='date', freq='D'), 'country'])['sentiment'].value_counts().unstack().fillna(0).reset_index()

# Now melt the DataFrame
sentiment_melted = sentiment_over_time_by_country.melt(id_vars=['date', 'country'], value_vars=['negative', 'neutral', 'positive'], var_name='sentiment', value_name='count')
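
The groupby, unstack, and melt chain above is easier to follow on a tiny frame. The sketch below uses made-up rows (four tweets from one country over two days) and the same sequence of calls: count sentiments per day and country, pivot sentiments into columns with zeros for missing combinations, then melt back to one row per date, country, and sentiment:

```python
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-07-25 10:00", "2020-07-25 11:00",
        "2020-07-25 12:00", "2020-07-26 09:00",
    ]),
    "country": ["India", "India", "India", "India"],
    "sentiment": ["positive", "positive", "negative", "neutral"],
})

# Wide form: one row per (day, country), one column per sentiment
counts = (
    toy.groupby([pd.Grouper(key="date", freq="D"), "country"])["sentiment"]
    .value_counts()
    .unstack(fill_value=0)
    .reset_index()
)

# Long form: one row per (day, country, sentiment), ready for hue= in seaborn
long_form = counts.melt(
    id_vars=["date", "country"], var_name="sentiment", value_name="count"
)
print(long_form)
```

Filling with zeros before melting keeps a data point for every sentiment on every day, so a line plot does not skip days on which a sentiment had no tweets.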
Plot sentiment over time for the top three countries
In [29]:
for country in most_active_countries:
    _ = plt.figure(figsize=(15,6))
    _ = sns.lineplot(data=sentiment_melted[sentiment_melted['country']==country],x='date',y='count',hue='sentiment')
    _ = plt.title(f'Sentiment Counts Over Time for {country}')
    _ = plt.show()

Task 8: Use Wordcloud to Visualize Words Used in Sentiments

Define the create_word_cloud() function
In [30]:
def create_word_cloud(sentiment):
    text=" ".join(df_twitter[df_twitter['sentiment']==sentiment]['text'].values)

    wordcloud = WordCloud(width=800,height=400,background_color='white').generate(text)
    plt.figure(figsize=(15,6))
    plt.imshow(wordcloud,interpolation='bilinear')
    plt.axis("off")
    plt.title(f'Word Cloud for {sentiment.capitalize()} Sentiments')
    plt.show()
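
A word cloud sizes words by how often they occur, so a plain frequency count is a quick textual check on what the image should emphasize. A minimal sketch with the standard library; `top_words` is a hypothetical helper, and the two sample texts are copied from the sentiment sample shown earlier:

```python
from collections import Counter


def top_words(texts, n=5):
    """Return the n most frequent whitespace-separated words across texts."""
    counter = Counter()
    for text in texts:
        counter.update(text.split())
    return counter.most_common(n)


sample_texts = [
    "updated table new case recovered active case covid twithaca",
    "jharsuguda district report covid positive case last hour",
]
print(top_words(sample_texts, n=2))  # prints [('case', 3), ('covid', 2)]
```

On the notebook's data this could be called with `df_twitter[df_twitter['sentiment'] == 'positive']['text']` to list the dominant words for one sentiment class.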
Creating word cloud for positive sentiment
In [31]:
create_word_cloud('positive')
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:522: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = (Image.ROTATE_90 if orientation is None else
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:499: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = Image.ROTATE_90
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:523: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  Image.ROTATE_90)
Creating word cloud for negative sentiment¶
In [32]:
create_word_cloud('negative')
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:522: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = (Image.ROTATE_90 if orientation is None else
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:499: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = Image.ROTATE_90
Creating word cloud for neutral sentiment¶
In [33]:
create_word_cloud('neutral')
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:522: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = (Image.ROTATE_90 if orientation is None else
C:\Users\drake\anaconda3\lib\site-packages\wordcloud\wordcloud.py:499: DeprecationWarning: ROTATE_90 is deprecated and will be removed in Pillow 10 (2023-07-01). Use Transpose.ROTATE_90 instead.
  orientation = Image.ROTATE_90

Task 9: Display the Sentiment on a Geographical Heatmap¶

Create a mapping from positive, neutral and negative to a numerical value for visualization purposes¶
In [34]:
sentiment_mapping = {'positive':1,'neutral':0,'negative':-1}
df_twitter['sentiment_value'] = df_twitter['sentiment'].map(sentiment_mapping)
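A small check can make the mapping step safer. `Series.map` with a dictionary returns NaN for any label that is not a key, so a stray label such as a differently capitalized one would silently become a missing value. The sketch below uses a hypothetical `labels` series (not the notebook's data) to show how unmapped labels could be surfaced before plotting.

```python
import pandas as pd

sentiment_mapping = {"positive": 1, "neutral": 0, "negative": -1}

# Hypothetical labels, including one that is not a key in the mapping.
labels = pd.Series(["positive", "neutral", "negative", "Positive"])

# Labels missing from the dictionary map to NaN.
values = labels.map(sentiment_mapping)

# Collect the distinct labels that failed to map.
unmapped = labels[values.isna()].unique().tolist()
print(values.tolist())
print("Unmapped labels:", unmapped)
```

If the list of unmapped labels is non-empty, normalizing the labels (for example with `.str.lower()`) before mapping is one possible fix.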
Visualize a geographical heatmap of sentiment for all tweets¶
In [35]:
fig = px.density_mapbox(df_twitter, lat='lat', lon='long',
                        z='sentiment_value', radius=20,
                        center=dict(lat=df_twitter.lat.mean(), 
                                    lon=df_twitter.long.mean()), 
                        zoom=4,
                        mapbox_style="open-street-map", 
                        height=900)
fig.show()
C:\Users\drake\anaconda3\lib\site-packages\plotly\io\_renderers.py:395: DeprecationWarning:

distutils Version classes are deprecated. Use packaging.version instead.
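A tabular cross-check can help when reading the map. The sketch below is an illustration on hypothetical coordinates, not the notebook's data: it rounds latitude and longitude to whole degrees to form coarse grid cells, then reports the mean `sentiment_value` and the tweet count per cell. The column names follow the notebook; the bin size and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical miniature of df_twitter with only the columns the map uses.
sample = pd.DataFrame({
    "lat": [40.71, 40.73, 34.05, 34.07],
    "long": [-74.00, -74.01, -118.24, -118.25],
    "sentiment_value": [1, -1, 1, 1],
})

# Round coordinates to whole degrees to form coarse grid cells.
sample["lat_bin"] = sample["lat"].round(0)
sample["long_bin"] = sample["long"].round(0)

# Mean sentiment and number of tweets per grid cell.
cell_summary = (
    sample.groupby(["lat_bin", "long_bin"])["sentiment_value"]
    .agg(["mean", "count"])
    .reset_index()
)
print(cell_summary)
```

Reporting the count next to the mean makes it easier to tell a cell with few tweets from a cell where opposing sentiments average out to zero.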

Part 5: Topic Modeling¶

Task 10: Train the LDA (Gensim) Model¶

Preprocess the text¶
In [36]:
df_twitter['text_tokens'] = df_twitter['text'].str.lower().str.split()
Create a dictionary¶
In [37]:
id2word = Dictionary(df_twitter['text_tokens'])
Filter extremes¶
In [38]:
id2word.filter_extremes(no_below=2, no_above=.99)
Create a corpus¶
In [39]:
corpus = [id2word.doc2bow(d) for d in df_twitter['text_tokens']]
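The dictionary, filtering and corpus steps can be sketched in plain Python to show what they compute. This is an illustration of the idea, not gensim's implementation: `build_vocab` and `to_bow` are hypothetical helpers, and the ids they assign will not match those in `id2word`. The filter keeps tokens that appear in at least `no_below` documents and in no more than a `no_above` fraction of them.

```python
from collections import Counter

def build_vocab(docs, no_below=2, no_above=0.99):
    """Keep tokens found in at least `no_below` documents and in at most
    a `no_above` fraction of all documents; return token -> integer id."""
    # Document frequency: count each token once per document.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    max_docs = no_above * len(docs)
    kept = sorted(t for t, df in doc_freq.items() if no_below <= df <= max_docs)
    return {token: idx for idx, token in enumerate(kept)}

def to_bow(doc, vocab):
    """Return sorted (token_id, count) pairs, skipping out-of-vocabulary tokens."""
    counts = Counter(vocab[t] for t in doc if t in vocab)
    return sorted(counts.items())

docs = [
    "covid cases rise again".split(),
    "covid vaccine trial".split(),
    "vaccine cases covid covid".split(),
]
vocab = build_vocab(docs, no_below=2, no_above=0.99)
print(vocab)
print([to_bow(d, vocab) for d in docs])
```

In this three-document toy corpus "covid" occurs in every document, so the `no_above` threshold removes it, while words seen in only one document fall below `no_below`.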
Instantiate an LDA model¶
In [40]:
base_model = LdaMulticore(corpus=corpus, num_topics=5, id2word=id2word, workers=12,
                          passes=5)
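The hard-coded `workers=12` suits one particular machine. A minimal sketch of deriving the worker count from the host instead is shown below; `pick_workers` is a hypothetical helper, and the `random_state` seed is an assumption added for repeatability that the original cell does not set. `os.cpu_count()` reports logical cores, which may exceed the number of physical cores.

```python
import os

def pick_workers(reserve=1, fallback=1):
    """Return a worker count that leaves `reserve` logical cores free."""
    cores = os.cpu_count() or fallback  # cpu_count() may return None
    return max(1, cores - reserve)

# Keyword arguments that could be passed on to LdaMulticore.
lda_kwargs = {
    "num_topics": 5,
    "passes": 5,
    "workers": pick_workers(),
    "random_state": 42,  # assumed seed, not in the original notebook
}
print(lda_kwargs)
```

These could then be supplied as `LdaMulticore(corpus=corpus, id2word=id2word, **lda_kwargs)`. Even with a fixed seed, results from multi-process training may still vary between runs.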

Task 11: Evaluate the Model¶

Print the topics¶
In [43]:
words = [re.findall(r'"([^"]*)"',t[1]) for t in base_model.print_topics()]

topics = [" ".join(t[0:10]) for t in words]

# Use topic_id rather than id to avoid shadowing the built-in id()
for topic_id, t in enumerate(topics):
    print(f"------ Topic {topic_id} ------")
    print(t, end="\n\n")
------ Topic 0 ------
covid people u one trump dont amp death coronavirus american

------ Topic 1 ------
covid coronavirus school test amp health tested student people pandemic

------ Topic 2 ------
covid vaccine u help spread risk case health even amp

------ Topic 3 ------
covid amp mask time pandemic people get like new go

------ Topic 4 ------
covid case new death india total coronavirus day positive last
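The regular expression in the cell above pulls the quoted words out of the string that `print_topics()` returns for each topic. The sketch below applies the same pattern to a hypothetical topic string (the weights are made up for illustration) and adds a second pattern, which is an assumption rather than part of the notebook, to recover the weights.

```python
import re

# Hypothetical topic string in the format print_topics() returns.
topic_repr = '0.041*"covid" + 0.013*"vaccine" + 0.009*"health"'

# Same pattern as the notebook: capture whatever sits between double quotes.
words = re.findall(r'"([^"]*)"', topic_repr)

# Additional pattern: capture the decimal weight in front of each '*'.
weights = [float(w) for w in re.findall(r'(\d*\.\d+)\*', topic_repr)]

print(list(zip(words, weights)))
```

When the weights are needed, gensim's `show_topic(topic_id, topn=10)` may be a simpler route, since it returns (word, probability) pairs without string parsing.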

Compute the perplexity and coherence score¶
In [44]:
base_perplexity = base_model.log_perplexity(corpus)
print('\nPerplexity: ', base_perplexity) 

coherence_model = CoherenceModel(model=base_model, texts=df_twitter['text_tokens'], 
                                   dictionary=id2word, coherence='c_v')
coherence_lda_model_base = coherence_model.get_coherence()
print('\nCoherence Score: ', coherence_lda_model_base)
Perplexity:  -8.4076177396794

Coherence Score:  0.2775571594826565
C:\Users\drake\anaconda3\lib\site-packages\scipy\sparse\_sputils.py:43: DeprecationWarning:

np.find_common_type is deprecated.  Please use `np.result_type` or `np.promote_types`.
See https://numpy.org/devdocs/release/1.25.0-notes.html and the docs for more information.  (Deprecated NumPy 1.25)
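A note on reading the first number: gensim's `log_perplexity` returns a per-word likelihood bound, which is why the value is negative, and gensim's own log output derives the perplexity estimate as 2 raised to the negative of that bound. The short calculation below applies that conversion to the value printed above; it was evaluated on the training corpus, so it should not be read as a held-out score.

```python
per_word_bound = -8.4076177396794  # value printed by the cell above

# Perplexity estimate = 2 ** (-bound); lower is better.
perplexity = 2 ** (-per_word_bound)
print(f"Perplexity estimate: {perplexity:.1f}")
```

The result is on the order of a few hundred, which is easier to compare across models than the raw bound.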


Task 12: Classify Twitter Tweets into Topics¶

Topic classification function¶
In [48]:
def classify_tweet(tweet):
    processed_tweet = tweet.lower().split()
    tweet_bow = id2word.doc2bow(processed_tweet)
    topic_probabilities = base_model.get_document_topics(tweet_bow)
    most_likely_topic = max(topic_probabilities, key=lambda x: x[1])
    return most_likely_topic[0]
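The selection step inside `classify_tweet` is an argmax over (topic id, probability) pairs. The sketch below isolates it in a hypothetical helper, `most_likely_topic`, and adds a fallback for an empty list, since `max` raises `ValueError` on an empty sequence. Whether an empty list can actually occur depends on the model's probability threshold, so the fallback value of -1 is an assumption.

```python
def most_likely_topic(topic_probabilities, default=-1):
    """Return the topic id with the highest probability, or `default` if empty."""
    if not topic_probabilities:
        return default
    topic_id, _ = max(topic_probabilities, key=lambda pair: pair[1])
    return topic_id

# Hypothetical outputs in the shape get_document_topics() returns.
print(most_likely_topic([(0, 0.10), (3, 0.72), (4, 0.18)]))
print(most_likely_topic([]))
```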
Classify all tweets¶
In [49]:
df_twitter['topic'] = df_twitter['text'].apply(classify_tweet)
Examine topic distribution¶
In [50]:
df_twitter.topic.value_counts()
Out[50]:
topic
0    26494
4    23301
1    21552
3    21144
2    19482
Name: count, dtype: int64

Part 6: Interpretation of Results¶

Task 13: Identify Relationships between Sentiment and Topic¶

Grouping and aggregating data by topic and sentiment¶
In [52]:
grouped = df_twitter.groupby(['topic','sentiment']).size().unstack(level='sentiment')
print(grouped)
sentiment  negative  neutral  positive
topic                                 
0             10947     6214      9333
1              5507     6800      9245
2              4638     5998      8846
3              5394     5832      9918
4              7321     8097      7883
Calculating proportions¶
In [53]:
percent_grouped = grouped.divide(grouped.sum(axis=1),axis=0)
print(percent_grouped)
sentiment  negative   neutral  positive
topic                                  
0          0.413188  0.234544  0.352268
1          0.255522  0.315516  0.428963
2          0.238066  0.307874  0.454060
3          0.255108  0.275823  0.469069
4          0.314193  0.347496  0.338312
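The proportions can be reproduced from the counts printed above, which also gives a quick consistency check. The sketch rebuilds the count table by hand, normalizes each row by its total, confirms that every row sums to 1, and reports the largest sentiment share per topic.

```python
import pandas as pd

# Counts copied from the grouped table above.
grouped = pd.DataFrame(
    {
        "negative": [10947, 5507, 4638, 5394, 7321],
        "neutral": [6214, 6800, 5998, 5832, 8097],
        "positive": [9333, 9245, 8846, 9918, 7883],
    },
    index=pd.Index(range(5), name="topic"),
)

# Divide each row by its own total.
percent_grouped = grouped.div(grouped.sum(axis=1), axis=0)

# Every row should sum to 1 after row-wise normalization.
assert (percent_grouped.sum(axis=1) - 1).abs().max() < 1e-9

# Sentiment with the largest share in each topic.
dominant = percent_grouped.idxmax(axis=1)
print(dominant)
```

Starting from the raw rows, `pd.crosstab(df_twitter['topic'], df_twitter['sentiment'], normalize='index')` should give the same proportions in a single call.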
Visualizing the results¶
In [54]:
palette = {'positive': '#66BB6A', 'neutral': '#BDBDBD', 'negative': '#EF5350'}
colors = [palette[col] for col in percent_grouped.columns]

percent_grouped.plot(kind='bar', stacked=True, color=colors)
plt.xlabel('Topic')
plt.ylabel('Proportion of Tweets')
plt.title('Proportion of Sentiments by Topic')
plt.legend(loc='upper right')
plt.show()
[Figure: stacked bar chart, "Proportion of Sentiments by Topic"; x-axis: Topic; y-axis: Proportion of Tweets]

Task 14: Interpret the Topic Modeling Results¶

Creating topic distance visualization¶
In [55]:
pyLDAvis.enable_notebook()
pyLDAvis.gensim.prepare(base_model,corpus,id2word)
Out[55]:

Task 15: Compile your Findings into a Final Report with NBConvert¶

Execute the command within the provided notebook cell¶
In [57]:
!jupyter nbconvert --to html ../CovidSentimentAnalyzer.ipynb --output-dir="C:\Users\drake\Desktop\Jupyter notebook"
[NbConvertApp] WARNING | pattern '../CovidSentimentAnalyzer.ipynb' matched no files
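The warning above indicates that the relative path did not resolve to a notebook from the directory where the command ran, so nbconvert printed its usage text and no report was written. A minimal sketch of checking the path before building the command follows; `build_nbconvert_command` is a hypothetical helper, the `report` output directory is a placeholder, and the correct notebook location is not known from the output above.

```python
from pathlib import Path

def build_nbconvert_command(notebook, output_dir):
    """Return the nbconvert argument list, or raise if the notebook is missing."""
    notebook_path = Path(notebook).resolve()
    if not notebook_path.is_file():
        raise FileNotFoundError(f"Notebook not found: {notebook_path}")
    return [
        "jupyter", "nbconvert", "--to", "html",
        str(notebook_path),
        f"--output-dir={output_dir}",
    ]

try:
    # Relative path from the original cell; it resolves against the
    # current working directory of the process.
    command = build_nbconvert_command("../CovidSentimentAnalyzer.ipynb", "report")
    print(command)  # could be passed to subprocess.run(command, check=True)
except FileNotFoundError as err:
    print(err)
```

Printing the resolved absolute path in the error message shows where the notebook was expected, which makes a wrong working directory easy to spot.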

CONGRATULATIONS¶